Abstract

In this first part, a computable outer bound is proved for the multiterminal source coding problem, for a 
setup with two encoders, discrete memoryless sources, and bounded distortion measures. 
I. Introduction

A. The Problem of Multiterminal Source Coding 

Consider two dependent sources $X$ and $Y$, with joint distribution $p(xy)$. These sources are to be encoded
by two separate encoders, each of which observes only one of them, and are to be decoded by a single joint
decoder. $X$ is encoded at rate $R_1$ and with average distortion $D_1$, and $Y$ is encoded at rate $R_2$ and with
average distortion $D_2$. This setup is illustrated in Fig. 1.



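[Figure 1: the sources $X^n$ and $Y^n$, drawn from $p(xy)$, feed Encoder 1 and Encoder 2, which output indices $i \in \{1 \ldots 2^{nR_1}\}$ and $j \in \{1 \ldots 2^{nR_2}\}$; a single decoder outputs the reconstructions $\hat{X}^n$ and $\hat{Y}^n$.]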



Fig. 1. System setup for multiterminal source coding. 

In the classical multiterminal source coding problem, as formulated in [4], [19], the goal is to determine 
the region of all achievable rate-distortion tuples $(R_1, R_2, D_1, D_2)$. Although relatively simple to describe
(a formal description is given later), the multiterminal source coding problem was one of the long-standing 
open problems in information theory - see, e.g., [12, pg. 443]. Furthermore, besides its historical interest, 
this problem also comes up naturally in the context of a sensor networking problem of interest to us [3]. 

Multiterminal source coding has a rich history. Its fundamental contributions, in chronological
order, are the works of: a) Dobrushin-Tsybakov [15], with the first rate-distortion problem with a Markov
chain constraint; b) Slepian-Wolf [18], with the formulation and solution to the first distributed source coding
problem, and Cover [11], with a simpler proof of the Slepian-Wolf result, a proof method widely in use
today; c) Ahlswede-Körner [1] and Wyner [22], with the first use of an auxiliary random variable to describe
the rate region of a source coding problem, and with it the need to introduce proof methods to bound their 
cardinality; d) Wyner-Ziv [23], with the first characterization of a multiterminal rate-distortion function; 
e) Berger-Tung [4], [19], with the first formulation and partial results on the multiterminal source coding 
problem as formulated in Fig. 1; and f) Berger-Yeung [7], [24], with a complete solution to a more general 
form of the Wyner-Ziv problem. For details on these, and on many more important contributions, as well 
as for historical information on the problem, the reader is referred to [6]. 

The setup of Fig. 1 represents what we feel was the simplest yet unsolved instance of a multiterminal
source coding problem. The problem of Fig. 1 and the CEO problem [8] are, to the best of our knowledge, the
last two known special cases of the general entropy characterization problem of Csiszár and Körner [13]
that remained unsolved. This hierarchy of problems is illustrated in Fig. 2.
Fig. 2. A hierarchy of problems in multiterminal source coding with two encoders and one decoder: an arrow from problem
X to problem Y indicates that X is a special case of Y, in the sense that a solution to Y automatically provides a solution 
to X. Abbreviations - SC: two-terminal lossless source coding; RD: two-terminal rate-distortion [17]; SW: distributed coding of 
dependent sources [18]; AKTW: source coding with side information [1], [22]; WZ: rate-distortion with side information [23]; BY: 
the Berger-Yeung extension of WZ theory [7]; DT: rate-distortion with a remote source [15]; BHOTW: a rate-distortion formulation 
of the Ahlswede-Körner-Wyner problem [5]; CEO: the CEO problem [8]; MTRD: the problem setup of Fig. 1; EC: the entropy
characterization problem [13]. Asterisks are used to indicate problems whose solution was previously known. 

It should be pointed out though that the setup of Fig. 1 is by no means the most general formulation
of a multiterminal source coding problem we could have given. There are many other ways in which we
could have chosen to formulate these problems: we could have chosen a network with M encoders and a 
single decoder which attempts to reconstruct L different functions of the sources, we could have considered 
continuous-alphabet and/or general ergodic sources, we could have considered feedback and interactive 
communication, we could have studied how this problem relates to the network coding problem, and we 
could have considered network topologies with multiple decoders as well. All these alternative possible 
formulations are discussed in detail in [6]. 

B. Difficulties in Proving a Converse 

Among the limited number of references mentioned above, we included the Berger-Tung bounds [4], [19].
These bounds do provide the best known descriptions of the region of achievable rates for the problem setup
of Fig. 1,^1 and so we elaborate on those now.

Proposition 1 (Berger-Tung Bounds): Fix $(D_1, D_2)$. Let $X$ and $Y$ be two sources out of which pairs
of sequences $(X^n, Y^n)$ are drawn i.i.d. $\sim p(xy)$; and let $U$ and $V$ be auxiliary variables defined over
alphabets $\mathcal{U}$ and $\mathcal{V}$, such that there exist functions $\gamma_1 : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}}$ and $\gamma_2 : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{Y}}$, for which
$E[d_1(X, \gamma_1(UV))] < D_1$ and $E[d_2(Y, \gamma_2(UV))] < D_2$. Consider rates $(R_1, R_2)$, such that $R_1 > I(XY \wedge U | V)$,
$R_2 > I(XY \wedge V | U)$, and $R_1 + R_2 > I(XY \wedge UV)$, for some joint distribution $p(xyuv)$. Now:

• for any $p(xyuv)$ that satisfies a Markov chain of the form $U - X - Y - V$, all rates $(R_1, R_2)$ obtained
for any such $p$ are achievable;

• if there exists a $p(xyuv)$ that satisfies two Markov chains of the form $U - X - Y$ and $X - Y - V$,
then if we consider the union of the set of rates defined for each such $p(xyuv)$, we must have that any
achievable rates are included in that union;

that is, the first condition defines an inner bound, and the second an outer bound to the rate region. □

The regions defined by these bounds, when regarded as images of maps that transform probability 
distributions into rate pairs, have a property that is a source of many difficulties: the mutual information 
expressions that define the inner and the outer bounds are identical, it is only the domains of the two maps 
that differ; as such, comparing the resulting regions is difficult. This difference between the inner and outer 
bounds has been the state of affairs in multiterminal source coding, since 1978. 

A close examination of these distributions suggested to us that the gap might not be due to a suboptimal 
coding strategy used in the inner bound, but instead that perhaps the outer bound allows for the inclusion 
of dependencies that cannot be physically realized by any distributed code. Consider these distributions: 

• For the inner bound, $p(xyuv) = p(xy)\,p(u|x)\,p(v|y)$.

• For the outer bound, $p(xyuv) = p(xy)\,p(u|x)\,p(v|yxu) = p(xy)\,p(v|y)\,p(u|xyv)$.
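The difference between these two factorizations can be checked numerically on a toy example. The sketch below is illustrative only and is not part of the original development: the binary source with crossover 0.1, the test channels, and the "shared noise" construction ($U = X \oplus A$, $V = Y \oplus A$ for a common fair bit $A$) are all assumptions chosen for simplicity, and all function names are hypothetical. The shared-noise distribution satisfies the two short chains $U - X - Y$ and $X - Y - V$, yet $V$ carries a full bit about $(X, U)$ beyond what $Y$ provides, so the long chain $U - X - Y - V$ fails.

```python
import math
from itertools import product

def marginal(joint, idx):
    """Marginalize a pmf {tuple: prob} onto the coordinates listed in idx."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in idx)
        out[sub] = out.get(sub, 0.0) + p
    return out

def cond_mi(joint, a, b, c=()):
    """I(A ; B | C) in bits; a, b, c are tuples of coordinate indices."""
    p_abc = marginal(joint, a + b + c)
    p_ac, p_bc, p_c = marginal(joint, a + c), marginal(joint, b + c), marginal(joint, c)
    total = 0.0
    for key, p in p_abc.items():
        if p <= 0.0:
            continue
        ka, kb, kc = key[:len(a)], key[len(a):len(a) + len(b)], key[len(a) + len(b):]
        total += p * math.log2(p * p_c[kc] / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total

def p_source(x, y, crossover=0.1):
    """Assumed toy source: X uniform, Y = X through a binary symmetric channel."""
    return 0.5 * (1.0 - crossover if x == y else crossover)

def inner_style_joint(q=0.2):
    """p(xy) p(u|x) p(v|y): each description depends on its own source only."""
    joint = {}
    for x, y, u, v in product((0, 1), repeat=4):
        pu = 1.0 - q if u == x else q
        pv = 1.0 - q if v == y else q
        joint[(x, y, u, v)] = p_source(x, y) * pu * pv
    return joint

def shared_noise_joint():
    """U = X xor A, V = Y xor A with a common fair bit A (hypothetical example)."""
    joint = {}
    for x, y, a in product((0, 1), repeat=3):
        key = (x, y, x ^ a, y ^ a)
        joint[key] = joint.get(key, 0.0) + p_source(x, y) * 0.5
    return joint

X, Y, U, V = (0,), (1,), (2,), (3,)
inner = inner_style_joint()
outer = shared_noise_joint()

print("inner: I(V; XU | Y) =", round(cond_mi(inner, V, X + U, Y), 6))
print("outer: I(U; Y | X)  =", round(cond_mi(outer, U, Y, X), 6))
print("outer: I(V; X | Y)  =", round(cond_mi(outer, V, X, Y), 6))
print("outer: I(V; XU | Y) =", round(cond_mi(outer, V, X + U, Y), 6))
```

The shared bit $A$ plays the role of a dependency that no pair of separate encoders could generate, which is the kind of slack in the outer bound discussed next.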

If we choose to interpret $U$ and $V$ as instantaneous descriptions of encodings of $X$ and $Y$, then we see
that the outer bound says that the encoding $V$ is allowed to contain information about $X$ beyond that which
can be extracted from $Y$, and likewise for $U$ and $Y$.^2 Motivated by this observation, in the first part of this
work we set ourselves the goal of finding a new outer bound.

^1 We note that recently, a new outer bound has been proposed for a version of multiterminal source coding that contains the
formulation of [4], [19] considered here as a special case [20], [21]. The new bound has many desirable properties: it unifies known
bounds custom developed for seemingly different problems, and it provides a conclusive answer for a previously unsolved instance.
However, when specialized to our two-encoder setup, it is unclear if the new bound provides an improvement over the Berger-Tung
outer bound. So, due to the simplicity of the latter, we have chosen here to focus on that one instead of on the more modern form.

^2 Note: this interpretation comes from the inner bound, and is only justified for blocks. $U^n$ does represent an encoding of $X^n$,
but it would be incorrect to say that the variable $U$ is an encoding of $X$ (and likewise for $V$ and $Y$). These insights can only be
carried so far, but at this point we are only trying to build some intuition, and thus it is permissible to take such liberties.



C. An Interpretation of Distributed Rate-Distortion Codes as Constrained Source Covers 

In Part I of this paper we present a finitely parameterized outer bound for the region of achievable rates of 
the multiterminal source coding problem of Fig. 1, based on what we believe is an original proof technique.
Some highlights of that proof method, formally developed in later sections, are provided here. 

1) Rate-Distortion Codes = Source Covers: Our proof tightens existing converses by means of identifying 
a constraint that all codes are subject to, but that is not captured by any existing outer bound. To explain 
what the constraint is, the easiest way to get started is by drawing an analogy to classical, two-terminal 
rate-distortion codes. 

In the standard, two-terminal rate-distortion problem, a generic code consists of the following elements: 

• A block length $n$.

• A cover $\{S_i : i = 1 \ldots 2^{nR}\}$ of the source $\mathcal{X}^n$.

• A reconstruction sequence $\hat{x}^n(i)$, associated to each cover element $S_i$.

Given this description, an encoder $f : \mathcal{X}^n \to \{1 \ldots 2^{nR}\}$ makes $f(x^n) = i$ for some source sequence $x^n$
and some index $i$, if $x^n \in S_i$, with ties broken arbitrarily; a decoder $g : \{1 \ldots 2^{nR}\} \to \hat{\mathcal{X}}^n$ simply maps
$g(i) = \hat{x}^n(i)$. And we say that the encoder/decoder pair $(f, g)$ satisfies a distortion constraint $D$ if, roughly,
$P\big(d(x^n, g(f(x^n))) < D\big) \approx 1$, for all $n$ large enough. Such a representation is illustrated in Fig. 3.
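The cover-based description of a classical code can be made concrete with a very small sketch. Everything specific below is an assumption made for illustration and does not come from the paper: a binary source with block length $n = 3$, normalized Hamming distortion, two reconstruction sequences, and balls defined with a non-strict inequality so that the two balls cover all eight source sequences. The function names are hypothetical.

```python
from itertools import product

def hamming_distortion(x, xhat):
    """Normalized per-letter Hamming distortion between two equal-length blocks."""
    return sum(a != b for a, b in zip(x, xhat)) / len(x)

n = 3
D = 1 / 3
# Assumed reconstruction sequences, one per cover element.
recon = [(0, 0, 0), (1, 1, 1)]

# Cover element S_i: all source sequences within distortion D of recon[i].
cover = [
    {x for x in product((0, 1), repeat=n) if hamming_distortion(x, c) <= D}
    for c in recon
]

def encode(x):
    """f(x^n) = i for the first cover element containing x^n (ties broken by order)."""
    for i, element in enumerate(cover):
        if x in element:
            return i
    raise ValueError("sequence not covered")

def decode(i):
    """g(i) = reconstruction sequence attached to cover element i."""
    return recon[i]

worst = max(
    hamming_distortion(x, decode(encode(x))) for x in product((0, 1), repeat=n)
)
print("cover sizes:", [len(s) for s in cover], "worst-case distortion:", worst)
```

In this toy case the cover happens to be a partition, so no tie-breaking is needed; with overlapping cover elements the `encode` function resolves ties by index order, which is one arbitrary choice among many.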




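[Figure 3: source sequences grouped into cover elements; each cover element is labeled with an encoder output $i \in \{1 \ldots 2^{nR}\}$ and a reconstruction sequence.]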



Fig. 3. Cover-based representation of a classical rate-distortion code. 

In an analogous manner, we specify an arbitrary distributed rate-distortion code as follows: 

• A block length $n$.

• Two covers:

- A cover $\{S_{1,i} : i = 1 \ldots 2^{nR_1}\}$ of the source $\mathcal{X}^n$.

- A cover $\{S_{2,j} : j = 1 \ldots 2^{nR_2}\}$ of the source $\mathcal{Y}^n$.



November 12, 2006. 



DRAFT 



Indirectly, these two covers specify a cover $\{S_{ij} = S_{1,i} \times S_{2,j} : i = 1 \ldots 2^{nR_1},\ j = 1 \ldots 2^{nR_2}\}$ of the
product alphabet $\mathcal{X}^n \times \mathcal{Y}^n$.

• For each cover element $S_{ij}$, we specify two reconstruction sequences $(\hat{x}^n(ij), \hat{y}^n(ij))$.

Given this description, an encoder $f_1 : \mathcal{X}^n \to \{1 \ldots 2^{nR_1}\}$ for node 1 makes $f_1(x^n) = i$ for some
source sequence $x^n$ and some index $i$, if $x^n \in S_{1,i}$, with ties broken arbitrarily (and similarly for an
encoder $f_2$ at node 2); a decoder $g : \{1 \ldots 2^{nR_1}\} \times \{1 \ldots 2^{nR_2}\} \to \hat{\mathcal{X}}^n \times \hat{\mathcal{Y}}^n$ simply maps $g(i,j) =
(\hat{x}^n(ij), \hat{y}^n(ij))$. And we say that the distributed code $(f_1, f_2, g)$ satisfies two distortion constraints $D_1$
and $D_2$ if, roughly, $P\big(d_1(x^n, \hat{x}^n) < D_1 \text{ and } d_2(y^n, \hat{y}^n) < D_2\big) \approx 1$, for all $n$ large enough, and for
$\hat{x}^n \hat{y}^n = g(f_1(x^n), f_2(y^n))$. Such a representation is illustrated in Fig. 4.




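[Figure 4: source sequences of $\mathcal{X}^n$ and of $\mathcal{Y}^n$ grouped into cover elements, each labeled with an encoder output (e.g., $i \in \{1 \ldots 2^{nR_1}\}$); the product cover element $S_{ij}$ in $\mathcal{X}^n \times \mathcal{Y}^n$ is labeled with its pair of reconstruction sequences.]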



Fig. 4. Cover-based representation of a distributed rate-distortion code. 

2) Constraints on the Structure of Source Covers: Our main insight is that, whereas in the classical 
problem any arbitrary cover defines a valid rate-distortion code, in multiterminal source coding this is no 
longer the case: only covers of the product source $\mathcal{X}^n \times \mathcal{Y}^n$ of the form $S_{ij} = S_{1,i} \times S_{2,j}$ can be realized
by distributed codes. The significance of this requirement is illustrated with an example in Fig. 5. 

From the informal argument of Fig. 5, we see how the fact that distributed codes produce covers only of
the form $S_{ij} = S_{1,i} \times S_{2,j}$ results in constraints on the sets used to cover the typical set $T_\epsilon^n(XY)$: there
are certain groups of typical sequences that cannot be broken, in the sense that either all of them appear
together in a cover element $S_{ij}$, or none of them appear. We believe this is significant for two main reasons:

• If we compare to a classical rate-distortion code, this constraint is clearly not there. Provided the 
distortion constraints are met, a classical code would be able to split the typical set into distortion balls, 
without any further constraints. 
Fig. 5. An example, to illustrate the significance of the requirement that cover elements $S_{ij}$ take a product form. Let $\mathcal{X} = \mathcal{Y} = \{0, 1\}$,
and $p(xy) = p(x)p(y|x)$ specified by a $p(x)$ such that $P(X = 0) = P(X = 1) = \frac{1}{2}$, and $p(y|x)$ a binary symmetric channel with
crossover probability $p_c$. Left: for each typical $x^n$, there is a "ring" of $y^n$'s jointly typical with it, centered at $x^n$ and of radius
$\approx n p_c$. Right: consider pairs $(x_1^n y_1^n)$ and $(x_2^n y_2^n)$ in $S_{ij}$; dashed circles denote distortion balls centered at $\hat{x}^n(ij)$ and $\hat{y}^n(ij)$
(with the centers omitted, for clarity), and dark shaded regions denote the intersection of two rings. Suppose now that all four pairs
$(x_1^n y_1^n)$, $(x_1^n y_2^n)$, $(x_2^n y_1^n)$, and $(x_2^n y_2^n)$ are in $T_\epsilon^n(XY)$. Because $S_{ij} = S_{1,i} \times S_{2,j}$, all four pairs must be in $S_{ij}$ as well: the
decoder does not have enough information to discriminate among these pairs. No such constraint exists with a centralized encoder.



• More fundamentally though, we view this constraint as a form of "independence," reminiscent to us 
of the extra independence assumption required by the long Markov chain used in the definition of the 
Berger-Tung inner bound, which is not there in the definition of the outer bound, as highlighted in 
Section I-B earlier.

This latter observation is perhaps the strongest piece of evidence that suggested to us that the Berger-Tung 
inner bound might be tight. 

D. Main Contributions and Organization of the Paper 

The main contribution presented in Part I of this paper is the development of an outer bound to the 
region of achievable rates for multiterminal source coding. This outer bound has two salient properties that 
distinguish it from existing bounds in the literature: 

• it is based on explicitly modeling a constraint on the structure of codes that, as we understand things, 
had not been captured by any previously developed bound; 

• and also unlike existing bounds, it is finitely parameterized. 

We believe that this outer bound coincides with the set of achievable rates defined by the Berger-Tung inner 
bound. This issue is thoroughly explored in Part II of this paper, in the context of our study of algorithmic 
issues involved in the effective computation of this bound.

The rest of this paper is organized as follows. In Section II we define our notation, and state our main
result. In Section III we state and prove some auxiliary lemmas that greatly simplify the proof of the main
theorem, a proof that is fully developed in Section IV. The paper concludes with an extensive discussion 
on our main result and its implications, in Section V. 

II. Preliminaries 
A. Definitions and Notation 

First, a word about notation. Random variables are denoted with capital letters, e.g., X. Realizations of 
these variables are denoted with lower case letters: e.g., X = x means that the random variable X takes on 
the value $x$. Script letters are typically used to denote alphabets, e.g., the random variable $X$ takes values
on an alphabet $\mathcal{X}$. The alphabets of all random variables considered in this work are always assumed finite.
Sets in general are denoted by capital boldface symbols, e.g., $\mathbf{S}$. The size of a set is denoted by $|\mathbf{S}|$. A
probability mass function on $\mathcal{X}$ is denoted by $p_X(x)$, or simply $p(x)$ when the variable that it applies to
is clear from the context. Sequences of elements from an alphabet $\mathcal{X}$ are denoted by boldface symbols
$\mathbf{x}^n$, and its $i$-th element by $x_i$; this sequence is an element of the extension alphabet $\mathcal{X}^n$. The expression
$\mathbf{x}_i^{j,n}$ denotes a subsequence of $\mathbf{x}^n$ consisting of the elements $[x_i, x_{i+1}, \ldots, x_j]$, whenever $i < j$, otherwise it
denotes an empty sequence; also, sometimes the length $n$ of the sequence will be clear from the context,
and then we simply write $\mathbf{x}_i^j$ instead of $\mathbf{x}_i^{j,n}$, whenever this does not cause confusion. The expression $\mathbf{x}^{\sim i,n}$
denotes the sequence $[x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n]$, and again, we write this as $\mathbf{x}^{\sim i}$ whenever $n$ is clear from
the context. The same conventions are followed for sequences of random variables. 

Given a boolean predicate $b(x)$ depending on a variable $x$, we write $1\{b(x)\}$ to denote the indicator function
for the predicate: this is a function that takes the value 1 whenever $b(x)$ is true, and 0 whenever it is false.
Given a sequence $x^n \in \mathcal{X}^n$, and an element $x \in \mathcal{X}$, we denote by $N(x; x^n)$ the type of $x^n$, defined as
$N(x; x^n) = \sum_{i=1}^n 1\{x_i = x\}$. Then, for any random variable $X$, any real number $\epsilon > 0$, and any integer $n > 0$,
we denote by $T_\epsilon^n(X)$ the strongly typical set of $X$ with parameters $n$ and $\epsilon$, defined as

$T_\epsilon^n(X) = \big\{ x^n \in \mathcal{X}^n \;\big|\; \forall x \in \mathcal{X} : \big|\tfrac{1}{n} N(x; x^n) - p_X(x)\big| < \tfrac{\epsilon}{|\mathcal{X}|} \big\}$.
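As a small illustration of this definition, the following sketch tests membership of a sequence in the strongly typical set by comparing its empirical letter frequencies with the pmf, using the per-letter tolerance $\epsilon / |\mathcal{X}|$. The function name and the dictionary representation of the pmf are assumptions made for the example. The same function covers the joint case if the "letters" are pairs $(x_i, y_i)$ and the pmf is the joint $p_{XY}$.

```python
from collections import Counter

def is_strongly_typical(seq, pmf, eps):
    """Return True if seq is strongly typical for pmf with parameter eps.

    seq: a sequence of symbols (symbols may themselves be tuples).
    pmf: dict mapping every symbol of the alphabet to its probability.
    """
    n = len(seq)
    counts = Counter(seq)
    # A symbol outside the alphabet can never be typical.
    if any(symbol not in pmf for symbol in counts):
        return False
    threshold = eps / len(pmf)
    return all(abs(counts.get(a, 0) / n - p) < threshold for a, p in pmf.items())

uniform = {0: 0.5, 1: 0.5}
print(is_strongly_typical((0, 1, 0, 1), uniform, eps=0.1))
print(is_strongly_typical((0, 0, 0, 1), uniform, eps=0.1))

# Joint typicality: zip the two sequences and use the joint pmf.
joint = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}
print(is_strongly_typical(tuple(zip((0, 1, 0, 1), (0, 1, 0, 1))), joint, eps=0.1))
```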

In some situations, we need to compare typical sets defined for the same set of variables, but induced by
different distributions on these variables. To resolve this ambiguity, we denote by $T_\epsilon^n(X)[p_X]$ the typical set
corresponding to a distribution $p_X$. The same convention is followed when there is similar ambiguity in the
evaluation of entropies (denoted $H(X)[p_X]$), and mutual information expressions (denoted $I(X \wedge Y)[p_{XY}]$).
Vector extensions $N(xy; x^n y^n)$, $T_\epsilon^n(XY)$, etc., are defined by considering the same definitions as above,
over a suitable product alphabet $\mathcal{X} \times \mathcal{Y}$. Similarly, given two random variables $X$ and $Y$, a joint probability
mass function $p_{XY}(xy)$, and a sequence $y^n$, we denote by $T_\epsilon^n(X|y^n)$ the conditional typical set of $X$ given
$y^n$, defined as

$T_\epsilon^n(X|y^n) = \big\{ x^n \in \mathcal{X}^n \;\big|\; \forall x \in \mathcal{X}, y \in \mathcal{Y} : \big|\tfrac{1}{n} N(xy; x^n y^n) - p_{XY}(xy)\big| < \tfrac{\epsilon}{|\mathcal{X}||\mathcal{Y}|} \big\}$.

We will also consider situations where we need to refer to the set of all typical sequences which are jointly
typical with at least one of a group. In that case, for a set $\mathbf{S} \subseteq \mathcal{Y}^n$, we write

$T_\epsilon^n(X|\mathbf{S}) = \bigcup_{y^n \in \mathbf{S}} T_\epsilon^n(X|y^n)$.

Given any $\epsilon > 0$, we often need to refer to quantities which are deterministic functions
of $\epsilon$, having the property that as $\epsilon \to 0$, these quantities also vanish. Such small quantities are denoted by
$\epsilon_1, \epsilon_2, \dot{\epsilon}, \ddot{\epsilon}, \epsilon', \epsilon''$, etc.; and the value of $\epsilon$ on which they depend is either mentioned explicitly or should be
clear from the context.

Consider two random variables $X$ and $Y$ with joint distribution $p(xy)$. $T_\epsilon^n(X)$ is the usual typical set.
Sometimes we also need to consider the set $S_{\epsilon,Y}^n(X) = \{x^n \mid T_\epsilon^n(Y|x^n) \neq \emptyset\}$. Clearly, $S_{\epsilon,Y}^n(X) \subseteq T_\epsilon^n(X)$.
But we also know from [25, Ch. 5], that $\big|\tfrac{1}{n} \log |S_{\epsilon,Y}^n(X)| - H(X)\big| < \epsilon$. That is, although there may exist
strongly typical sequences $x^n$ for which there are no sequences $y^n$ jointly typical with them, these $x^n$'s
form a set of vanishing measure.

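Some standard operations on sets are intersection ($\mathbf{A} \cap \mathbf{B}$), union ($\mathbf{A} \cup \mathbf{B}$), complementation ($\mathbf{A}^c$) and
difference ($\mathbf{A} \setminus \mathbf{B}$). The set of all subsets of $\mathbf{S}$ is denoted by $2^{\mathbf{S}}$. The convex closure of $\mathbf{S}$ is denoted by
$\overline{\mathbf{S}} = \bigcap \{\mathbf{S}' \mid \mathbf{S} \subseteq \mathbf{S}' \wedge \mathbf{S}' \text{ is closed and convex}\}$. Given a set $\mathbf{S}$, a cover of size $N$ of $\mathbf{S}$ is a collection of sets
$\mathcal{S} = \{\mathbf{S}_i : i = 1 \ldots N\}$, such that $\mathbf{S} \subseteq \bigcup_{i=1}^N \mathbf{S}_i$. If a cover further satisfies that $\mathbf{S}_i \cap \mathbf{S}_j = \emptyset$ $(1 \le i \ne j \le N)$,
and that $\mathbf{S} = \bigcup_{i=1}^N \mathbf{S}_i$, then we say that $\mathcal{S}$ is a partition of $\mathbf{S}$.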

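Consider two sets, $\mathbf{A}$ and $\mathbf{B}$, for which $P(\mathbf{B}|\mathbf{A}) = 1$: clearly, $P(\mathbf{A} \cap \mathbf{B}) = P(\mathbf{A})$, and hence $\mathbf{A} \subseteq \mathbf{B}$,
except perhaps for a set of measure zero. If instead we have a slightly weaker condition, namely that
$P(\mathbf{B}|\mathbf{A}) > 1 - \epsilon$, then we say that $\mathbf{A}$ is weakly included in $\mathbf{B}$, and we denote this by $\mathbf{A} \subseteq_\epsilon \mathbf{B}$.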

B. Distributed Rate-Distortion Codes 

Consider two sources $X$ and $Y$, out of which random pairs of sequences $(X^n, Y^n)$ are drawn i.i.d. $\sim p(xy)$
from two finite alphabets, denoted $\mathcal{X}$ and $\mathcal{Y}$, and reproduced with elements of two other alphabets $\hat{\mathcal{X}}$ and
$\hat{\mathcal{Y}}$. The two sources $X$ and $Y$ are processed by two separate encoders. The encoders are two functions:

$f_1 : \mathcal{X}^n \to \{1, 2, \ldots, 2^{nR_1}\}$ and $f_2 : \mathcal{Y}^n \to \{1, 2, \ldots, 2^{nR_2}\}$.

These encoding functions map a block of $n$ source symbols to discrete indices. The decoder is a function

$g : \{1, 2, \ldots, 2^{nR_1}\} \times \{1, 2, \ldots, 2^{nR_2}\} \to \hat{\mathcal{X}}^n \times \hat{\mathcal{Y}}^n$,

which maps a pair of indices into two blocks of reconstructed source sequences.

Two distortion measures $d_1 : \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ and $d_2 : \mathcal{Y} \times \hat{\mathcal{Y}} \to [0, \infty)$ are used to define reconstruction
quality. Since $\infty$ is not in their range and the alphabets are finite, these distortion measures are necessarily
bounded, so we denote these largest values by $\max d_1(x, \hat{x}) = d_{1,\mathrm{MAX}}$, $\max d_2(y, \hat{y}) = d_{2,\mathrm{MAX}}$, and
$\max(d_{1,\mathrm{MAX}}, d_{2,\mathrm{MAX}}) = d_{\mathrm{MAX}} < \infty$. $d_1^n(x^n, \hat{x}^n) = \frac{1}{n}\sum_{i=1}^n d_1(x_i, \hat{x}_i)$ and $d_2^n(y^n, \hat{y}^n) = \frac{1}{n}\sum_{i=1}^n d_2(y_i, \hat{y}_i)$
denote the corresponding extensions to blocks. Oftentimes, the symbols $d_1$ and $d_2$ are used for both the
single-letter and the block extensions; which is the intended meaning should be clear from the context. For
any distortion measure $d : \mathcal{X}^n \times \hat{\mathcal{X}}^n \to [0, \infty)$, an element $\hat{x}^n \in \hat{\mathcal{X}}^n$ and a number $D > 0$, a "ball" of radius
$D$ centered at $\hat{x}^n$ is the set $B(\hat{x}^n, D) = \{x^n \in \mathcal{X}^n \mid d(x^n, \hat{x}^n) < D\}$ (and similarly for a ball $B(\hat{y}^n, D)$).
For any $D$, $D^+$ is shorthand for $D + \epsilon$, for an $\epsilon$ that is always clear from the context.

Fix now encoders and decoder $(f_1, f_2, g)$ operating on blocks of length $n$, and a real number $\epsilon > 0$. If
we have that

$P\big(\{(x^n y^n) \mid (\hat{x}^n \hat{y}^n) = g(f_1(x^n), f_2(y^n)) \wedge d_1(x^n, \hat{x}^n) < D_1^+ \wedge d_2(y^n, \hat{y}^n) < D_2^+\}\big) > 1 - \epsilon$, (1)

then we say that $(f_1, f_2, g)$ satisfies the $(\epsilon, D_1, D_2)$-distortion constraint.^3

C. Achievable Rates 

A $(2^{nR_1}, 2^{nR_2}, n, \epsilon, D_1, D_2)$ distributed rate-distortion code is defined by a block length $n$, a parameter
$\epsilon > 0$, two encoding functions $f_1$ and $f_2$ with ranges of size $2^{nR_1}$ and $2^{nR_2}$, and a decoding function $g$,
such that $(f_1, f_2, g)$ satisfies the $(\epsilon, D_1, D_2)$-distortion constraints.

We say that the rate-distortion tuple $(R_1, R_2, D_1, D_2)$ is $\epsilon$-achievable if a $(2^{nR_1}, 2^{nR_2}, n, \epsilon, D_1, D_2)$
distributed code exists; for fixed parameters $(\epsilon, D_1, D_2)$, we denote the set of all $\epsilon$-achievable pairs $(R_1, R_2)$
by $\mathcal{R}_\epsilon(D_1, D_2)$. Then, the rate region $\mathcal{R}^*(D_1, D_2)$ of the two sources is defined by

$\mathcal{R}^*(D_1, D_2) = \bigcap_{\epsilon > 0} \mathcal{R}_\epsilon(D_1, D_2)$.

^3 This form of a distortion constraint is referred to as an $\epsilon$-fidelity criterion in [14, pg. 123]. An alternative form to this "local"
condition is given by requiring a "global" average constraint of the form $E[d_1(x^n, \hat{x}^n)] < D_1$ and $E[d_2(y^n, \hat{y}^n)] < D_2$. For
the purpose of our developments, the local form lends itself more readily to analysis, and hence is the one we adopt.

Now we are going to describe a different set of rates. Define $\mathcal{P}_{LB}$ to be the set of all probability distributions
$p(xy\hat{x}\hat{y})$ over $\mathcal{X} \times \mathcal{Y} \times \hat{\mathcal{X}} \times \hat{\mathcal{Y}}$, such that:

• $p(xy\hat{x}\hat{y}) = p(xy)\,p(\hat{x}|xy)\,p(\hat{y}|xy)$ (that is, $\hat{X} - XY - \hat{Y}$ forms a Markov chain);

• $p_{XY} = \sum_{\hat{x}\hat{y}} p(xy)\,p(\hat{x}|xy)\,p(\hat{y}|xy)$ ($p_{XY}$ is the source);

• and $E[d_1(X, \hat{X})] < D_1$ and $E[d_2(Y, \hat{Y})] < D_2$.

Then, for each $p \in \mathcal{P}_{LB}$, define

$\mathcal{R}(D_1, D_2, p) = \big\{ (R_1, R_2) \;\big|\; R_1 > I(X \wedge \hat{X}\hat{Y} | Y)[p],\ \ R_2 > I(Y \wedge \hat{X}\hat{Y} | X)[p],\ \ R_1 + R_2 > I(XY \wedge \hat{X}\hat{Y})[p] \big\}$,

and define also $\mathcal{R}^o(D_1, D_2) = \bigcup_{p \in \mathcal{P}_{LB}} \mathcal{R}(D_1, D_2, p)$. Now we are ready to state our outer bound.
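To make the region defined by a single distribution more tangible, the sketch below evaluates the three mutual information quantities that bound $R_1$, $R_2$ and $R_1 + R_2$ for one toy distribution. All specifics are assumptions for illustration and are not taken from the paper: a binary source with $Y$ equal to $X$ passed through a binary symmetric channel, and reconstructions $\hat{X}$, $\hat{Y}$ obtained by passing $X$ and $Y$ through independent binary symmetric test channels (a special case of the factorization $p(xy)p(\hat{x}|xy)p(\hat{y}|xy)$). Function names are hypothetical, and a full evaluation of the bound would require a search over all such distributions, which this sketch does not attempt.

```python
import math
from itertools import product

def marginal(joint, idx):
    """Marginalize a pmf {tuple: prob} onto the coordinates listed in idx."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in idx)
        out[sub] = out.get(sub, 0.0) + p
    return out

def cond_mi(joint, a, b, c=()):
    """I(A ; B | C) in bits; a, b, c are tuples of coordinate indices."""
    p_abc = marginal(joint, a + b + c)
    p_ac, p_bc, p_c = marginal(joint, a + c), marginal(joint, b + c), marginal(joint, c)
    total = 0.0
    for key, p in p_abc.items():
        if p <= 0.0:
            continue
        ka, kb, kc = key[:len(a)], key[len(a):len(a) + len(b)], key[len(a) + len(b):]
        total += p * math.log2(p * p_c[kc] / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total

def bsc(out, inp, q):
    """Transition probability of a binary symmetric channel with crossover q."""
    return 1.0 - q if out == inp else q

def build_joint(crossover, q1, q2):
    """Assumed toy p(x, y, xhat, yhat) = p(xy) p(xhat|x) p(yhat|y)."""
    joint = {}
    for x, y, xh, yh in product((0, 1), repeat=4):
        p_xy = 0.5 * bsc(y, x, crossover)
        joint[(x, y, xh, yh)] = p_xy * bsc(xh, x, q1) * bsc(yh, y, q2)
    return joint

def rate_bounds(joint):
    """Return (I(X ^ XhYh | Y), I(Y ^ XhYh | X), I(XY ^ XhYh)) in bits."""
    X, Y, XH, YH = (0,), (1,), (2,), (3,)
    r1 = cond_mi(joint, X, XH + YH, Y)
    r2 = cond_mi(joint, Y, XH + YH, X)
    r12 = cond_mi(joint, X + Y, XH + YH)
    return r1, r2, r12

r1, r2, r12 = rate_bounds(build_joint(crossover=0.1, q1=0.1, q2=0.1))
print(f"R1 > {r1:.4f}, R2 > {r2:.4f}, R1 + R2 > {r12:.4f}")
```

With Hamming distortion, the expected distortions of this toy distribution are simply the test-channel crossovers `q1` and `q2`, so the distortion constraints reduce to `q1 < D1` and `q2 < D2` in this special case.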

D. Statement of an Outer Bound 



Theorem 1: $\mathcal{R}^*(D_1, D_2) \subseteq \mathcal{R}^o(D_1, D_2)$.



The proof of this theorem is given in Section IV. Before that, and next in Section III, we develop a 
number of observations and auxiliary results to be used in the main proof.

III. Some Useful Observations and Auxiliary Results 
A. Distributed Rate-Distortion Codes as Constrained Source Covers 

1) Distributed Source Covers: An equivalent representation for a generic $(2^{nR_1}, 2^{nR_2}, n, \epsilon, D_1, D_2)$ code
is given as follows:

• Two covers: $\mathcal{S}_1 = \{S_{1,i} : i = 1 \ldots 2^{nR_1}\}$ of $\mathcal{X}^n$, and $\mathcal{S}_2 = \{S_{2,j} : j = 1 \ldots 2^{nR_2}\}$ of $\mathcal{Y}^n$. Any code with
encoders $f_1$ and $f_2$ can be represented in terms of two such covers, by considering $f_1^{-1}(i) = S_{1,i}$ and
$f_2^{-1}(j) = S_{2,j}$.^4 (Note: these two covers define a cover $\mathcal{S} = (\mathcal{S}_1, \mathcal{S}_2)$ of $\mathcal{X}^n \times \mathcal{Y}^n$, with elements $S_{ij} = S_{1,i} \times S_{2,j}$,
for $(i,j) \in \{1 \ldots 2^{nR_1}\} \times \{1 \ldots 2^{nR_2}\}$.)

• A pair of reconstruction sequences $(\hat{x}^n(ij), \hat{y}^n(ij)) = g(i,j)$ associated to each cover element $S_{ij}$ of
the product source, for all $(i,j) \in \{1 \ldots 2^{nR_1}\} \times \{1 \ldots 2^{nR_2}\}$.

In general, whenever we refer to a distributed rate-distortion code, we use interchangeably the earlier
representation in terms of two encoders and one decoder, and this representation in terms of covers.

^4 Note that, strictly speaking, this definition is correct only when $\mathcal{S}$ is a partition. Occasionally we might abuse the notation and
still refer to the code specified by a cover, with the understanding that in such cases ties (of the form of a source sequence being 
part of two different cover elements) are broken arbitrarily. This should not cause any confusion. 
2) Distributed Typical Sets: As highlighted in the Introduction, it turns out that covers $S_{ij}$ of the product
source $\mathcal{X}^n \times \mathcal{Y}^n$ are constrained beyond the requirements imposed by the fidelity criteria. That "extra"
structure is described by Proposition 2.



Proposition 2: For any cover $\mathcal{S}$ of $\mathcal{X}^n \times \mathcal{Y}^n$ defined by some $(2^{nR_1}, 2^{nR_2}, n, \epsilon, D_1, D_2)$ distributed
rate-distortion code, and for any $(i,j) \in \{1 \ldots 2^{nR_1}\} \times \{1 \ldots 2^{nR_2}\}$, $x^n \in S_{1,i}$ and $y^n \in S_{2,j}$, it
must be the case that either $(x^n y^n) \in S_{ij} \cap T_\epsilon^n(XY)$ or $(x^n y^n) \notin T_\epsilon^n(XY)$. □



Proof. This is rather straightforward. Take any $x^n \in S_{1,i}$ and $y^n \in S_{2,j}$. Then:

• by construction, $(x^n y^n) \in S_{ij}$;

• either $(x^n y^n) \in T_\epsilon^n(XY)$ or $(x^n y^n) \notin T_\epsilon^n(XY)$, a tautology;

• if $(x^n y^n) \in T_\epsilon^n(XY)$, then $(x^n y^n) \in S_{ij} \cap T_\epsilon^n(XY)$, and therefore the proposition is proved;

• and if instead, $(x^n y^n) \notin T_\epsilon^n(XY)$, then the proposition is proved too. ■

Proposition 2 formally states the property of covers arising from distributed codes discussed informally
in the Introduction (cf. Sec. I-C.1): all combinations of an $x^n$ sequence in $S_{1,i}$ and a $y^n$ sequence in $S_{2,j}$,
if they are jointly typical, must appear in $S_{ij} \cap T_\epsilon^n(XY)$: the decoder does not have enough information
to discriminate among such pairs.

We now introduce a new definition. Consider any subset $\mathbf{S} \subseteq T_\epsilon^n(XY)$ for which, for any $(x_1^n, y_1^n) \in \mathbf{S}$
and $(x_2^n, y_2^n) \in \mathbf{S}$, we have that either $(x_1^n y_2^n) \in \mathbf{S}$ or $(x_1^n y_2^n) \notin T_\epsilon^n(XY)$; that is, the property of Prop. 2
holds for $\mathbf{S}$. In this case, we say that $\mathbf{S}$ is a distributed typical set.

Clearly there are "interesting" distributed typical sets, so the concept is not vacuous:

• all sets of the form $\mathbf{S} = \{(x^n y^n)\}$, with $(x^n y^n) \in T_\epsilon^n(XY)$, are distributed typical sets;

• for any $\mathbf{S}_1 \subseteq \mathcal{X}^n$ and any $\mathbf{S}_2 \subseteq \mathcal{Y}^n$, $\mathbf{S} = [\mathbf{S}_1 \times \mathbf{S}_2] \cap T_\epsilon^n(XY)$ is a distributed typical set.

The last example provides a natural way of systematically constructing distributed typical sets.
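The defining property can be checked mechanically on small examples. In the sketch below, which is only illustrative, the jointly typical set is supplied directly as a finite set of abstract pairs rather than computed from a distribution, and the labels `"x1"`, `"y1"`, etc. are hypothetical stand-ins for sequences. The function names are assumptions. The check follows the definition: a subset of the typical set qualifies if every "cross" pair formed from two of its members is either in the subset or not typical.

```python
from itertools import product

def is_distributed_typical(subset, typical):
    """Check the cross-pair property for a subset of the jointly typical set."""
    if not subset <= typical:
        return False
    for (x1, _), (_, y2) in product(subset, repeat=2):
        # A typical cross pair that is missing from the subset violates the property.
        if (x1, y2) in typical and (x1, y2) not in subset:
            return False
    return True

def product_construction(s1, s2, typical):
    """[S1 x S2] intersected with the typical set, as in the second example above."""
    return {(x, y) for x in s1 for y in s2} & typical

# Assumed toy "typical set" of pairs.
typical = {("x1", "y1"), ("x1", "y2"), ("x2", "y1"), ("x2", "y2"), ("x3", "y3")}

good = product_construction({"x1", "x2"}, {"y1", "y2", "y3"}, typical)
bad = {("x1", "y1"), ("x2", "y2")}  # omits the typical cross pairs

print(is_distributed_typical(good, typical))
print(is_distributed_typical(bad, typical))
print(is_distributed_typical({("x3", "y3")}, typical))
```

The set `bad` is exactly the situation of Fig. 5: two pairs whose cross combinations are also typical cannot be separated from those combinations by any product-form cover element.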

3) Source Covers Made of Distributed Typical Sets: We show next that in multiterminal source coding, 
the source must be covered with distributed typical sets in which each of the two components of the set 
gets specified by a different encoder. 
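Consider a length $n$ $(f_1, f_2, g)$ code, satisfying the $(\epsilon, D_1, D_2)$-distortion constraint of eqn. (1):

$P\big(\{(x^n y^n) \mid (\hat{x}^n \hat{y}^n) = g(f_1(x^n), f_2(y^n)) \wedge d_1(x^n, \hat{x}^n) < D_1^+ \wedge d_2(y^n, \hat{y}^n) < D_2^+\}\big)$

$\overset{(a)}{=} P\big(\{(x^n y^n) \mid (\hat{x}^n \hat{y}^n) = g(f_1(x^n), f_2(y^n)) \wedge d_1(x^n, \hat{x}^n) < D_1^+ \wedge d_2(y^n, \hat{y}^n) < D_2^+\} \cap \bigcup_{(ij)} S_{ij}\big)$

$= P\big(\bigcup_{(ij)} \{(x^n y^n) \mid (\hat{x}^n \hat{y}^n) = g(f_1(x^n), f_2(y^n)) \wedge d_1(x^n, \hat{x}^n) < D_1^+ \wedge d_2(y^n, \hat{y}^n) < D_2^+\} \cap S_{ij}\big)$

$\overset{(b)}{=} P\big(\bigcup_{(ij)} \{(x^n y^n) \mid d_1(x^n, \hat{x}^n(ij)) < D_1^+ \wedge x^n \in S_{1,i} \wedge d_2(y^n, \hat{y}^n(ij)) < D_2^+ \wedge y^n \in S_{2,j}\}\big)$

$= P\big(\bigcup_{(ij)} [S_{1,i} \times S_{2,j}] \cap [B(\hat{x}^n(ij), D_1^+) \times B(\hat{y}^n(ij), D_2^+)]\big)$

$\overset{(c)}{>} 1 - \epsilon$,

where (a) follows from $\{(x^n y^n) \mid (\hat{x}^n \hat{y}^n) = g(f_1(x^n), f_2(y^n)) \wedge d_1(x^n, \hat{x}^n) < D_1^+ \wedge d_2(y^n, \hat{y}^n) < D_2^+\} \subseteq \mathcal{X}^n \times \mathcal{Y}^n \subseteq \bigcup_{(ij)} S_{ij}$;
(b) follows from $S_{ij} = S_{1,i} \times S_{2,j}$; and (c) follows from the fact that the
code under consideration satisfies the distortion constraint of eqn. (1). We also know, from basic properties
of typical sets, that

$P\big(T_\epsilon^n(XY)\big) > 1 - \epsilon$,

and so, if we define $S_{ij} = [S_{1,i} \times S_{2,j}] \cap T_\epsilon^n(XY)$, we see that

$P\big(\bigcup_{(ij)} [S_{1,i} \times S_{2,j}] \cap [B(\hat{x}^n(ij), D_1^+) \times B(\hat{y}^n(ij), D_2^+)] \cap T_\epsilon^n(XY)\big)$

$= P\big(\bigcup_{(ij)} S_{ij} \cap [B(\hat{x}^n(ij), D_1^+) \times B(\hat{y}^n(ij), D_2^+)]\big)$

$> 1 - \epsilon$; (2)

that is, since $S_{ij}$ is a distributed typical set, the source must be covered with the fraction of such sets contained
in pairs of balls centered at the reconstruction sequences; furthermore, we note that each component of the
distributed typical set must be specified completely by each encoder.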



B. The "Reverse" Markov Lemma

1) The Standard Form: Lemma 1 is the Markov lemma as stated in [4, pg. 202], in our own notation.
Lemma 1 (Markov): Consider a Markov chain of the form $X - Z - Y$. Then, for all $\epsilon > 0$,

$\lim_{n \to \infty} P\big((X^n, y^n) \in T_\epsilon^n(XY) \;\big|\; (Z^n, y^n) \in T_\epsilon^n(ZY)\big) = 1$,

for any sequence $y^n \in \mathcal{Y}^n$. □

The lemma says that for every $y^n \in \mathcal{Y}^n$, if the random vector $(Z^n, y^n) \in T_\epsilon^n(ZY)$, then the random vector
$(X^n, y^n) \in T_\epsilon^n(XY)$, with high probability. This is not true in general: if we have two pairs of sequences
$(x^n z^n) \in T_\epsilon^n(XZ)$ and $(z^n y^n) \in T_\epsilon^n(ZY)$, it is not always the case that $(x^n z^n y^n) \in T_\epsilon^n(XZY)$, and
therefore that $(x^n y^n) \in T_\epsilon^n(XY)$; that is, joint typicality is not a transitive relation. However, if $X - Z - Y$
forms a Markov chain, and then only in a high probability sense, said transitivity property holds.

2) A Converse Statement: We are interested in a converse form of the Markov lemma. Suppose we are 
given an arbitrary distribution p{xyz), whose typical sets satisfy the constraints imposed by the Markov 
lemma: can we say that p itself must be a Markov chain? It turns out the answer is almost yes - if some 
arbitrary distribution p induces typical sets like those of a Markov chain, then there must exist a Markov 
chain $p'$ within $L_1$ distance $2\epsilon$ of $p$. This statement is made precise in the following lemma.



Lemma 2 (Reverse Markov): Fix $n$, $\epsilon > 0$. Consider any distribution $p(xyz)$ for which, for some $z^n$,

$T_\epsilon^n(X|z^n)[p] \times T_\epsilon^n(Y|z^n)[p] = T_\epsilon^n(XY|z^n)[p]$.

Define a Markov chain $p'(xyz) = p(z)\,p(x|z)\,p(y|z)$, with the components $p(z)$, $p(x|z)$ and $p(y|z)$
taken from the given $p(xyz)$. Then, $\|p - p'\|_1 < 2\epsilon$. □



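Proof. Consider any $z^n$ for which $T_\epsilon^n(XY|z^n)[p] \neq \emptyset$. Since $p'$ is a Markov chain, from the direct form
of the Markov lemma we know that

$T_\epsilon^n(X|z^n)[p'] \times T_\epsilon^n(Y|z^n)[p'] \subseteq_\epsilon T_\epsilon^n(XY|z^n)[p']$;

and clearly, $\emptyset \neq T_\epsilon^n(XY|z^n)[p] = T_\epsilon^n(X|z^n)[p] \times T_\epsilon^n(Y|z^n)[p] = T_\epsilon^n(X|z^n)[p'] \times T_\epsilon^n(Y|z^n)[p']$, since
we choose $p'$ to coincide with $p$ on the corresponding marginals, and from our choice of $z^n$. So, this last
inclusion can be written as

$T_\epsilon^n(X|z^n)[p] \times T_\epsilon^n(Y|z^n)[p] \subseteq_\epsilon T_\epsilon^n(XY|z^n)[p']$,

and therefore we see that

$\emptyset \neq T_\epsilon^n(X|z^n)[p] \times T_\epsilon^n(Y|z^n)[p] \subseteq_\epsilon T_\epsilon^n(XY|z^n)[p] \cap T_\epsilon^n(XY|z^n)[p']$;

thus, there must exist at least one triplet of sequences $(x^n y^n z^n)$ that is jointly typical under both $p$ and $p'$.
So for these particular sequences, it follows from the definition of strong typicality that both

$\forall xyz : \big|\tfrac{1}{n} N(xyz; x^n y^n z^n) - p(xyz)\big| < \tfrac{\epsilon}{|\mathcal{X}||\mathcal{Y}||\mathcal{Z}|}$ and $\forall xyz : \big|\tfrac{1}{n} N(xyz; x^n y^n z^n) - p'(xyz)\big| < \tfrac{\epsilon}{|\mathcal{X}||\mathcal{Y}||\mathcal{Z}|}$,

and therefore the $L_1$ norm of $p - p'$ can be written as

$\|p - p'\|_1 = \sum_{xyz} |p(xyz) - p'(xyz)|$

$= \sum_{xyz} \big|p(xyz) - \tfrac{1}{n} N(xyz; x^n y^n z^n) + \tfrac{1}{n} N(xyz; x^n y^n z^n) - p'(xyz)\big|$

$\le \sum_{xyz} \big|\tfrac{1}{n} N(xyz; x^n y^n z^n) - p(xyz)\big| + \sum_{xyz} \big|\tfrac{1}{n} N(xyz; x^n y^n z^n) - p'(xyz)\big|$

$< 2\epsilon$,

thus proving the lemma. ■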

Our interest in this question stems from the fact that, from the requirement to cover a product source with 
distributed typical sets, we do get constraints on the shape of various typical sets. So we need to characterize 
what distributions can give rise to those sets, and this lemma plays an important role in that. 
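The two objects compared in Lemma 2, the given distribution $p$ and the Markov chain $p'(xyz) = p(z)p(x|z)p(y|z)$ built from its marginals, are easy to compute for small alphabets. The following sketch is only an illustration of that construction and of the $L_1$ distance; it does not verify the typical-set hypothesis of the lemma, and the function names and the two example distributions are assumptions.

```python
from itertools import product

def markov_projection(p):
    """Build p'(x,y,z) = p(z) p(x|z) p(y|z) from a pmf p given as {(x, y, z): prob}."""
    pz, pxz, pyz = {}, {}, {}
    xs, ys = set(), set()
    for (x, y, z), v in p.items():
        pz[z] = pz.get(z, 0.0) + v
        pxz[(x, z)] = pxz.get((x, z), 0.0) + v
        pyz[(y, z)] = pyz.get((y, z), 0.0) + v
        xs.add(x)
        ys.add(y)
    # p(x|z) p(y|z) p(z) = p(x,z) p(y,z) / p(z), defined wherever p(z) > 0.
    return {
        (x, y, z): pxz.get((x, z), 0.0) * pyz.get((y, z), 0.0) / pz[z]
        for z in pz if pz[z] > 0.0
        for x in xs
        for y in ys
    }

def l1_distance(p, q):
    """Sum of |p - q| over the union of the two supports."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

# Example 1 (assumed): X - Z - Y already holds, so p' coincides with p.
p_markov = {
    (x, y, z): 0.5 * (0.9 if x == z else 0.1) * (0.8 if y == z else 0.2)
    for x, y, z in product((0, 1), repeat=3)
}

# Example 2 (assumed): X = Y uniform and Z constant, far from any chain X - Z - Y.
p_dependent = {(0, 0, 0): 0.5, (1, 1, 0): 0.5}

print(l1_distance(p_markov, markov_projection(p_markov)))
print(l1_distance(p_dependent, markov_projection(p_dependent)))
```

In the second example the distance is large, consistent with the lemma: such a $p$ cannot induce conditional typical sets with the product structure of a Markov chain for small $\epsilon$.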

C. Upper Bounds on the Size of Distributed Typical Cover Elements 



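Lemma 3: Consider any $(2^{nR_1}, 2^{nR_2}, n, \epsilon, D_1, D_2)$ distributed rate-distortion code, represented by a cover $S$. Then, there exists a distribution $\pi \in \mathcal{P}_{LB}$ such that, for all $(i,j) \in \{1 \ldots 2^{nR_1}\} \times \{1 \ldots 2^{nR_2}\}$ and all $\dot{\epsilon} > 0$,

$$\left|S_{ij} \cap T_\epsilon^n(XY)\right| \le 2^{n(H(XY|\hat{X}\hat{Y})[\pi] + \dot{\epsilon})},$$

provided $n$ is large enough. Furthermore, for all $y^n \in \mathcal{Y}^n$,

$$\left|S_{1,i} \cap T_\epsilon^n(X|y^n)\right| \le 2^{n(H(X|\hat{X}\hat{Y}Y)[\pi] + \epsilon')},$$

and similarly for all $x^n \in \mathcal{X}^n$,

$$\left|S_{2,j} \cap T_\epsilon^n(Y|x^n)\right| \le 2^{n(H(Y|\hat{X}\hat{Y}X)[\pi] + \epsilon'')},$$

also provided $n$ is large enough. □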



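Proof. From the two-terminal rate-distortion theorem [14, Thm. 2.2.3], we know there exists a distribution $p(xy\hat{x}\hat{y}) = p(xy)p(\hat{x}\hat{y}|xy)$, with $p(xy)$ the given source, $E[d_1(X,\hat{X})] \le D_1$ and $E[d_2(Y,\hat{Y})] \le D_2$, and sequences $\hat{x}^n(ij)$ and $\hat{y}^n(ij)$ such that, for all $(i,j) \in \{1 \ldots 2^{nR_1}\} \times \{1 \ldots 2^{nR_2}\}$ and all $\epsilon > 0$,

$$S_{ij} \subseteq T_\epsilon^n(XY|\hat{x}^n(ij)\hat{y}^n(ij)), \tag{3}$$

provided $n$ is large enough. But since for distributed codes we have $S_{ij} = [S_{1,i} \times S_{2,j}] \cap T_\epsilon^n(XY)$, it follows from standard properties of typical sets that

$$S_{1,i} \cap T_\epsilon^n(X|S_{2,j}) \subseteq T_\epsilon^n(X|\hat{x}^n(ij)\hat{y}^n(ij)) \quad \text{and} \quad S_{2,j} \cap T_\epsilon^n(Y|S_{1,i}) \subseteq T_\epsilon^n(Y|\hat{x}^n(ij)\hat{y}^n(ij)).$$

Consider now a new cover $S'$, having the property that

$$S'_{1,i} \cap T_\epsilon^n(X|S'_{2,j}) = T_\epsilon^n(X|\hat{x}^n(ij)\hat{y}^n(ij)) \quad \text{and} \quad S'_{2,j} \cap T_\epsilon^n(Y|S'_{1,i}) = T_\epsilon^n(Y|\hat{x}^n(ij)\hat{y}^n(ij)).$$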

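A simple expression for the cover element $S'_{1,i}$ is obtained as follows. Fix an index $i \in \{1 \ldots 2^{nR_1}\}$:

$$\begin{aligned}
\forall k: \ S'_{1,i} \cap T_\epsilon^n(X|S'_{2,k}) &= T_\epsilon^n(X|\hat{x}^n(ik)\hat{y}^n(ik)) \\
\bigcup_{k=1}^{2^{nR_2}} S'_{1,i} \cap T_\epsilon^n(X|S'_{2,k}) &= \bigcup_{k=1}^{2^{nR_2}} T_\epsilon^n(X|\hat{x}^n(ik)\hat{y}^n(ik)) \\
S'_{1,i} \cap \bigcup_{k=1}^{2^{nR_2}} T_\epsilon^n(X|S'_{2,k}) &= \bigcup_{k=1}^{2^{nR_2}} T_\epsilon^n(X|\hat{x}^n(ik)\hat{y}^n(ik)),
\end{aligned}$$

and since $P(S^n_{\epsilon,Y}(X)) > 1 - \epsilon$, $S'_{1,i}$ is determined up to a set of vanishing measure; similarly, fixing $j \in \{1 \ldots 2^{nR_2}\}$, we get $S'_{2,j} \cap S^n_{\epsilon,X}(Y) = \bigcup_{l=1}^{2^{nR_1}} T_\epsilon^n(Y|\hat{x}^n(lj)\hat{y}^n(lj))$.
The new cover $S'$ has some useful properties:

• for all $(i,j)$, $S_{1,i} \cap S^n_{\epsilon,Y}(X) \subseteq S'_{1,i} \cap S^n_{\epsilon,Y}(X)$ and $S_{2,j} \cap S^n_{\epsilon,X}(Y) \subseteq S'_{2,j} \cap S^n_{\epsilon,X}(Y)$, and therefore $S_{ij} \subseteq S'_{ij}$ as well, by construction;
• for all $(x^n y^n) \in S'_{ij}$, $d_1(x^n, \hat{x}^n(ij)) \le D_1$ and $d_2(y^n, \hat{y}^n(ij)) \le D_2$, from the joint typicality conditions defining $S'_{1,i}$ and $S'_{2,j}$;
• and $P(\bigcup_{ij} S'_{ij}) \ge P(\bigcup_{ij} S_{ij}) > 1 - \epsilon$;

so, $S'$ "dominates" $S$ (in that every element in $S$ is contained in one element of $S'$), and $S'$ satisfies the same distortion constraints that $S$ does. Therefore, an upper bound on the size of the elements in the new cover $S'$ is also an upper bound on the size of the elements in the given cover $S$.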
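Next we observe that the new cover element $S'_{ij}$ can be "sandwiched" in between two other terms:

$$\left[T_\epsilon^n(X|\hat{x}^n(ij)\hat{y}^n(ij)) \times T_\epsilon^n(Y|\hat{x}^n(ij)\hat{y}^n(ij))\right] \cap T_\epsilon^n(XY) \overset{(a)}{\subseteq} \left[S'_{1,i} \times S'_{2,j}\right] \cap T_\epsilon^n(XY) \overset{(b)}{\subseteq} T_\epsilon^n(XY|\hat{x}^n(ij)\hat{y}^n(ij)),$$

where (a) follows from our choice of $S'_{1,i}$ and $S'_{2,j}$, and from elementary algebra of sets; and (b) follows from eqn. (3), and from the product form of distributed covers. So, since the other inclusion always holds,

$$\left[T_\epsilon^n(X|\hat{x}^n(ij)\hat{y}^n(ij)) \times T_\epsilon^n(Y|\hat{x}^n(ij)\hat{y}^n(ij))\right] \cap T_\epsilon^n(XY) = T_\epsilon^n(XY|\hat{x}^n(ij)\hat{y}^n(ij))$$

is a necessary condition on any suitable distribution $p(xy\hat{x}\hat{y})$ whose typical sets can be used to construct the cover $S'$; or equivalently, since this must hold for every $(i,j)$,

$$\left[T_\epsilon^n(X|\hat{x}^n\hat{y}^n) \times T_\epsilon^n(Y|\hat{x}^n\hat{y}^n)\right] \cap T_\epsilon^n(XY) = T_\epsilon^n(XY|\hat{x}^n\hat{y}^n),$$

for any sequences $\hat{x}^n$ and $\hat{y}^n$ such that $T_\epsilon^n(XY|\hat{x}^n\hat{y}^n) \neq \emptyset$. Finally we note that this last condition is equivalent to

$$T_\epsilon^n(X|\hat{x}^n\hat{y}^n) \times T_\epsilon^n(Y|\hat{x}^n\hat{y}^n) = T_\epsilon^n(XY|\hat{x}^n\hat{y}^n). \tag{4}$$

This is because this last equality already forces any $x^n \in T_\epsilon^n(X|\hat{x}^n\hat{y}^n)$ and $y^n \in T_\epsilon^n(Y|\hat{x}^n\hat{y}^n)$ to be jointly typical. Therefore, from the reverse Markov lemma, we conclude there exists a distribution $\pi(xy\hat{x}\hat{y})$, which satisfies a Markov chain of the form $X - \hat{X}\hat{Y} - Y$, such that $\|p - \pi\|_1 < 2\epsilon$.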



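Next we observe that if $\|p - \pi\|_1 < 2\epsilon$, then conditionals and marginals of $p$ and of $\pi$ are also close. Consider, for example, $p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) = \sum_{xy} p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})$ and $\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) = \sum_{xy} \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})$:

$$\begin{aligned}
\|p_{\hat{X}\hat{Y}}(\cdot) - \pi_{\hat{X}\hat{Y}}(\cdot)\|_1 &= \sum_{\hat{x}\hat{y}} \left|p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) - \pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})\right| \\
&= \sum_{\hat{x}\hat{y}} \left| \Big(\sum_{x'y'} p_{XY\hat{X}\hat{Y}}(x'y'\hat{x}\hat{y})\Big) - \Big(\sum_{x''y''} \pi_{XY\hat{X}\hat{Y}}(x''y''\hat{x}\hat{y})\Big) \right| \\
&= \sum_{\hat{x}\hat{y}} \left| \sum_{xy} p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) - \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) \right| \\
&\le \sum_{xy\hat{x}\hat{y}} \left| p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) - \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) \right| \\
&< 2\epsilon.
\end{aligned}$$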



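For the conditional $p_{XY|\hat{X}\hat{Y}}(xy|\hat{x}\hat{y})$:

$$\begin{aligned}
\|p_{XY|\hat{X}\hat{Y}}(\cdot|\hat{x}\hat{y}) - \pi_{XY|\hat{X}\hat{Y}}(\cdot|\hat{x}\hat{y})\|_1
&= \sum_{xy} \left|p_{XY|\hat{X}\hat{Y}}(xy|\hat{x}\hat{y}) - \pi_{XY|\hat{X}\hat{Y}}(xy|\hat{x}\hat{y})\right| \\
&= \sum_{xy} \left| \frac{p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})}{p_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} - \frac{\pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})}{\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} \right| \\
&= \frac{1}{p_{\hat{X}\hat{Y}}(\hat{x}\hat{y})\,\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} \sum_{xy} \left| p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})\,\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) - \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})\,p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) \right| \\
&\overset{(a)}{\le} \frac{1}{p_{\hat{X}\hat{Y}}(\hat{x}\hat{y})\,\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} \sum_{xy} \left| p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})\,p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) + p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})\,2\epsilon - \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})\,p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) \right| \\
&\le \frac{1}{p_{\hat{X}\hat{Y}}(\hat{x}\hat{y})\,\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} \sum_{xy} \left[ 2\epsilon\, p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) + p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) \left| p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) - \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) \right| \right] \\
&= \frac{1}{p_{\hat{X}\hat{Y}}(\hat{x}\hat{y})\,\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} \left[ 2\epsilon\, p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) + p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) \sum_{xy} \left| p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) - \pi_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y}) \right| \right] \\
&\le \frac{4\epsilon}{\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y})} \\
&\triangleq \epsilon_1,
\end{aligned}$$

where (a) follows from the $L_1$ bound on the marginals $p_{\hat{X}\hat{Y}}$ and $\pi_{\hat{X}\hat{Y}}$ above; and provided both $p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) \neq 0$ and $\pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) \neq 0$. We also note that under the assumption that $\|p_{XY\hat{X}\hat{Y}} - \pi_{XY\hat{X}\hat{Y}}\|_1 < 2\epsilon$, there exists a value $\bar{\epsilon}$ such that, for all $0 < \epsilon < \bar{\epsilon}$, it is not possible to have a pair $(\hat{x}_0\hat{y}_0)$ such that $p_{\hat{X}\hat{Y}}(\hat{x}_0\hat{y}_0) > 0$ but $\pi_{\hat{X}\hat{Y}}(\hat{x}_0\hat{y}_0) = 0$, or vice versa. This is because $\pi_{\hat{X}\hat{Y}}(\hat{x}_0\hat{y}_0) = 0$ means that for all $xy$, $\pi_{XY\hat{X}\hat{Y}}(xy\hat{x}_0\hat{y}_0) = 0$. But if $p_{\hat{X}\hat{Y}}(\hat{x}_0\hat{y}_0) > 0$, this means there exists at least one $x_0y_0$ such that $p_{XY\hat{X}\hat{Y}}(x_0y_0\hat{x}_0\hat{y}_0) > 0$, and as a result, $\|p_{XY\hat{X}\hat{Y}} - \pi_{XY\hat{X}\hat{Y}}\|_1 \ge p_{XY\hat{X}\hat{Y}}(x_0y_0\hat{x}_0\hat{y}_0)$; thus, setting $\bar{\epsilon} = \tfrac{1}{2}\, p_{XY\hat{X}\hat{Y}}(x_0y_0\hat{x}_0\hat{y}_0)$, so that $2\epsilon < p_{XY\hat{X}\hat{Y}}(x_0y_0\hat{x}_0\hat{y}_0)$ for all $\epsilon < \bar{\epsilon}$, we get the sought contradiction. Thus, for all $\epsilon$ small enough, the bound on the conditionals holds as well, and so we have from [12, Thm. 16.3.2] that

$$\left| H(XY|\hat{X} = \hat{x}, \hat{Y} = \hat{y})[p] - H(XY|\hat{X} = \hat{x}, \hat{Y} = \hat{y})[\pi] \right| \le -\epsilon_1 \log \frac{\epsilon_1}{|\mathcal{X}||\mathcal{Y}|} \triangleq \epsilon_2, \tag{5}$$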
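and so,

$$\begin{aligned}
&\left| H(XY|\hat{X}\hat{Y})[p] - H(XY|\hat{X}\hat{Y})[\pi] \right| \\
&\quad \le \sum_{\hat{x}\hat{y}} \left| p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) H(XY|\hat{X} = \hat{x}, \hat{Y} = \hat{y})[p] - \pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) H(XY|\hat{X} = \hat{x}, \hat{Y} = \hat{y})[\pi] \right| \\
&\quad \overset{(a)}{\le} |\hat{\mathcal{X}}| \cdot |\hat{\mathcal{Y}}| \cdot \left| p_{\hat{X}\hat{Y}}(\hat{x}^*\hat{y}^*) H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[p] - \pi_{\hat{X}\hat{Y}}(\hat{x}^*\hat{y}^*) H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[\pi] \right| \\
&\quad \overset{(b)}{\le} |\hat{\mathcal{X}}| \cdot |\hat{\mathcal{Y}}| \cdot \left| \pi_{\hat{X}\hat{Y}}(\hat{x}^*\hat{y}^*) H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[p] + 2\epsilon H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[p] - \pi_{\hat{X}\hat{Y}}(\hat{x}^*\hat{y}^*) H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[\pi] \right| \\
&\quad \le |\hat{\mathcal{X}}| \cdot |\hat{\mathcal{Y}}| \cdot \left[ 2\epsilon H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[p] + \pi_{\hat{X}\hat{Y}}(\hat{x}^*\hat{y}^*) \left| H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[p] - H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[\pi] \right| \right] \\
&\quad \overset{(c)}{\le} |\hat{\mathcal{X}}| \cdot |\hat{\mathcal{Y}}| \cdot \left[ 2\epsilon H(XY|\hat{X} = \hat{x}^*, \hat{Y} = \hat{y}^*)[p] + \pi_{\hat{X}\hat{Y}}(\hat{x}^*\hat{y}^*)\, \epsilon_2 \right] \\
&\quad \triangleq \epsilon_3,
\end{aligned}$$

where (a) follows from choosing $\hat{x}^*\hat{y}^*$ as the pair $\hat{x}\hat{y} \in \hat{\mathcal{X}} \times \hat{\mathcal{Y}}$ that makes the difference $\left| p_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) H(XY|\hat{X} = \hat{x}, \hat{Y} = \hat{y})[p] - \pi_{\hat{X}\hat{Y}}(\hat{x}\hat{y}) H(XY|\hat{X} = \hat{x}, \hat{Y} = \hat{y})[\pi] \right|$ largest; (b) follows from $\|p_{\hat{X}\hat{Y}} - \pi_{\hat{X}\hat{Y}}\|_1 < 2\epsilon$; and
(c) follows from eqn. (5) above, and from the triangle inequality.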
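The entropy-continuity step can be checked numerically on a small example. The sketch below assumes the usual statement of the $L_1$ bound on entropy: for pmfs $p$, $q$ on an alphabet of size $m$ with $d = \|p - q\|_1 \le 1/2$, one has $|H(p) - H(q)| \le -d \log(d/m)$. The function names and the two pmfs are hypothetical, chosen only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a pmf given as a list of probabilities."""
    return -sum(v * math.log2(v) for v in p if v > 0)


def l1_entropy_bound(d, alphabet_size):
    """Right-hand side -d * log2(d / m) of the bound, stated for 0 <= d <= 1/2."""
    if d == 0:
        return 0.0
    if d > 0.5:
        raise ValueError("the bound is stated only for L1 distance at most 1/2")
    return -d * math.log2(d / alphabet_size)


# Two hypothetical pmfs on a 4-letter alphabet, at L1 distance about 0.1.
p = [0.5, 0.25, 0.125, 0.125]
q = [0.45, 0.3, 0.125, 0.125]

d = sum(abs(a - b) for a, b in zip(p, q))
gap = abs(entropy(p) - entropy(q))
bound = l1_entropy_bound(d, len(p))
print(gap, bound, gap <= bound)
```

In the proof above the same bound is applied with $d = \epsilon_1$ and alphabet size $|\mathcal{X}||\mathcal{Y}|$, which is how $\epsilon_2$ arises.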

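We conclude this part of the proof by noting that completely analogous arguments can be made to show that

$$\left| H(X|\hat{X}\hat{Y}Y)[p] - H(X|\hat{X}\hat{Y}Y)[\pi] \right| \le \epsilon_4 \quad \text{and} \quad \left| H(Y|\hat{X}\hat{Y}X)[p] - H(Y|\hat{X}\hat{Y}X)[\pi] \right| \le \epsilon_5.$$

We are now ready to prove our desired bounds.
Since for all $(i,j)$, $S_{ij} \subseteq S'_{ij} = T_\epsilon^n(XY|\hat{x}^n(ij)\hat{y}^n(ij))$, the standard bound on the size of conditionally typical sets and the entropy bound $\epsilon_3$ give

$$\left|S_{ij} \cap T_\epsilon^n(XY)\right| \le \left|T_\epsilon^n(XY|\hat{x}^n(ij)\hat{y}^n(ij))\right| \le 2^{n(H(XY|\hat{X}\hat{Y})[p] + \epsilon)} \le 2^{n(H(XY|\hat{X}\hat{Y})[\pi] + \epsilon + \epsilon_3)};$$

therefore, choosing $\dot{\epsilon} = \epsilon + \epsilon_3$, the first bound specified by the lemma follows.

For the other two bounds, fix now $y^n \in \mathcal{Y}^n$. Since $S$ is a cover, there must exist at least one value $j_0 \in \{1 \ldots 2^{nR_2}\}$, such that $y^n \in S_{2,j_0}$. So consider any $i \in \{1 \ldots 2^{nR_1}\}$, and assume $S_{1,i} \cap T_\epsilon^n(X|y^n) \neq \emptyset$; based on this assumption, pick any $x^n \in S_{1,i} \cap T_\epsilon^n(X|y^n)$. This means that $(x^n y^n) \in [S_{1,i} \times S_{2,j_0}] \cap T_\epsilon^n(XY)$, and therefore that $(x^n y^n) \in [S'_{1,i} \times S'_{2,j_0}] \cap T_\epsilon^n(XY)$, and hence from eqn. (3) we have that $(x^n y^n \hat{x}^n(ij_0)\hat{y}^n(ij_0)) \in T_\epsilon^n(XY\hat{X}\hat{Y})$, and therefore we conclude that

$$S_{1,i} \cap T_\epsilon^n(X|y^n) \subseteq T_\epsilon^n(X|\hat{x}^n(ij_0)\hat{y}^n(ij_0)y^n).$$

We also note that if $S_{1,i} \cap T_\epsilon^n(X|y^n) = \emptyset$, then the last inclusion holds trivially. Thus,

$$\left|S_{1,i} \cap T_\epsilon^n(X|y^n)\right| \le \left|T_\epsilon^n(X|\hat{x}^n(ij_0)\hat{y}^n(ij_0)y^n)\right| \le 2^{n(H(X|\hat{X}\hat{Y}Y)[p] + \epsilon)} \le 2^{n(H(X|\hat{X}\hat{Y}Y)[\pi] + \epsilon + \epsilon_4)}.$$

Therefore, choosing $\epsilon' = \epsilon + \epsilon_4$, the second bound specified by the lemma holds. And the third (and last) bound follows from an argument identical to this last one. So the lemma is proved. ■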

IV. Proof of Theorem 1 
Consider any $(2^{nR_1}, 2^{nR_2}, n, \epsilon, D_1, D_2)$ distributed rate-distortion code, represented by a cover $S$. Then,

$$\begin{aligned}
n(R_1 + R_2) &\ge H(f_1(X^n) f_2(Y^n)) \\
&= H(f_1(X^n) f_2(Y^n)) - H(f_1(X^n) f_2(Y^n) | X^n Y^n) \\
&= I(X^n Y^n \wedge f_1(X^n) f_2(Y^n)) \\
&= H(X^n Y^n) - H(X^n Y^n | f_1(X^n) f_2(Y^n)) \\
&= nH(XY) - \sum_{1 \le i \le 2^{nR_1},\, 1 \le j \le 2^{nR_2}} P(f_1(X^n) = i, f_2(Y^n) = j)\, H(X^n Y^n | f_1(X^n) = i, f_2(Y^n) = j) \\
&\ge nH(XY) - \Big[ \max_{1 \le i \le 2^{nR_1},\, 1 \le j \le 2^{nR_2}} H(X^n Y^n | f_1(X^n) = i, f_2(Y^n) = j) \Big] \Big[ \sum_{1 \le i \le 2^{nR_1},\, 1 \le j \le 2^{nR_2}} P(f_1(X^n) = i, f_2(Y^n) = j) \Big] \\
&= nH(XY) - \max_{1 \le i \le 2^{nR_1},\, 1 \le j \le 2^{nR_2}} H(X^n Y^n | f_1(X^n) = i, f_2(Y^n) = j) \\
&\overset{(a)}{\ge} nH(XY) - \Big[ \max_{1 \le i \le 2^{nR_1},\, 1 \le j \le 2^{nR_2}} \log_2 |S_{ij}| \Big] - n\epsilon_1 \\
&\overset{(b)}{\ge} nH(XY) - nH(XY|\hat{X}\hat{Y})[\pi] - n\dot{\epsilon} - n\epsilon_1 \\
&= nI(XY \wedge \hat{X}\hat{Y})[\pi] - n\dot{\epsilon} - n\epsilon_1,
\end{aligned}$$

where (a) follows from splitting outcomes of $X^n Y^n$ into typical and non-typical ones, and from bounding the entropy of the typical ones with a uniform distribution; and (b) follows from Lemma 3, for some $\pi \in \mathcal{P}_{LB}$.
For the individual rates, we have the following chain of inequalities:

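$$\begin{aligned}
nR_1 &\ge H(f_1(X^n)) \\
&\ge H(f_1(X^n)|Y^n) - H(f_1(X^n)|X^n Y^n) \\
&= H(X^n|Y^n) - H(X^n|f_1(X^n) Y^n) \\
&= nH(X|Y) - \sum_{i=1}^{2^{nR_1}} \sum_{y^n \in \mathcal{Y}^n} P(f_1(X^n) = i, Y^n = y^n)\, H(X^n | f_1(X^n) = i, Y^n = y^n) \\
&\ge nH(X|Y) - \Big[ \max_{i = 1 \ldots 2^{nR_1},\, y^n \in \mathcal{Y}^n} H(X^n | f_1(X^n) = i, Y^n = y^n) \Big] \Big[ \sum_{i=1}^{2^{nR_1}} \sum_{y^n \in \mathcal{Y}^n} P(f_1(X^n) = i, Y^n = y^n) \Big] \\
&= nH(X|Y) - \max_{i = 1 \ldots 2^{nR_1},\, y^n \in \mathcal{Y}^n} H(X^n | f_1(X^n) = i, Y^n = y^n) \\
&\overset{(a)}{\ge} nH(X|Y) - \Big[ \max_{i = 1 \ldots 2^{nR_1},\, y^n \in \mathcal{Y}^n} \log_2 \left|S_{1,i} \cap T_\epsilon^n(X|y^n)\right| \Big] - n\epsilon_1 \\
&\overset{(b)}{\ge} nH(X|Y) - nH(X|\hat{X}\hat{Y}Y)[\pi] - n\epsilon' - n\epsilon_1 \\
&= nI(X \wedge \hat{X}\hat{Y}|Y)[\pi] - n\epsilon' - n\epsilon_1,
\end{aligned}$$

where the second inequality holds because conditioning does not increase entropy and $H(f_1(X^n)|X^n Y^n) = 0$; (a) follows from splitting the outcomes of $X^n$ into those that are jointly typical with the given sequence $y^n$ and those that are not, and from bounding the entropy of the typical ones with a uniform distribution; and (b) follows from Lemma 3. An identical argument shows that $nR_2 \ge nI(Y \wedge \hat{X}\hat{Y}|X)[\pi] - n\epsilon'' - n\epsilon_1$.
And since these conditions must hold for all $\epsilon > 0$, the theorem follows. ■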
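For a concrete distribution $\pi$ over $(x, y, \hat{x}, \hat{y})$, the three information quantities that appear at the end of these chains can be evaluated directly. The following is a minimal sketch under stated assumptions: the helper names (`entropy_of`, `cond_mi`, `outer_bound_terms`) and the two toy distributions are hypothetical, and the code only evaluates $I(X \wedge \hat{X}\hat{Y}|Y)$, $I(Y \wedge \hat{X}\hat{Y}|X)$ and $I(XY \wedge \hat{X}\hat{Y})$ for one given $\pi$; it does not optimize over $\mathcal{P}_{LB}$.

```python
import math


def entropy_of(p, idx):
    """Entropy in bits of the coordinates listed in idx, for a pmf dict p."""
    m = {}
    for key, prob in p.items():
        sub = tuple(key[i] for i in idx)
        m[sub] = m.get(sub, 0.0) + prob
    return -sum(v * math.log2(v) for v in m.values() if v > 0)


def cond_mi(p, a, b, c=()):
    """I(A ^ B | C) = H(AC) + H(BC) - H(ABC) - H(C), for disjoint index tuples."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy_of(p, a + c) + entropy_of(p, b + c)
            - entropy_of(p, a + b + c) - entropy_of(p, c))


def outer_bound_terms(pi):
    """Return (r1, r2, rsum) for a pmf over (x, y, xhat, yhat).

    Coordinates: 0 = X, 1 = Y, 2 = Xhat, 3 = Yhat.
    """
    r1 = cond_mi(pi, (0,), (2, 3), (1,))    # I(X ^ Xhat Yhat | Y)
    r2 = cond_mi(pi, (1,), (2, 3), (0,))    # I(Y ^ Xhat Yhat | X)
    rsum = cond_mi(pi, (0, 1), (2, 3))      # I(XY ^ Xhat Yhat)
    return r1, r2, rsum


# Hypothetical example 1: X = Y uniform bit, reproduced without distortion.
pi_same = {(0, 0, 0, 0): 0.5, (1, 1, 1, 1): 0.5}

# Hypothetical example 2: X, Y independent uniform bits, reproduced exactly.
pi_indep = {(x, y, x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(outer_bound_terms(pi_same))
print(outer_bound_terms(pi_indep))
```

Both toy distributions satisfy the chain $X - \hat{X}\hat{Y} - Y$ trivially, since $X$ and $Y$ are deterministic given the reproductions.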

V. Discussion
We conclude the first part of this paper with some discussion on the results proved so far.

A. Finite Parameterization of $\mathcal{R}^{o}(D_1, D_2)$

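The class of distributions used to define the Berger-Tung inner bound is given by:

$$\mathcal{P}_{BT} \triangleq \left\{ p_{XYUV} : \begin{array}{l} p(xy) = \sum_{uv} p_{XYUV}(xyuv), \\ U - X - Y - V \text{ is a Markov chain}, \\ E[d_1(X, \gamma_1(U,V))] \le D_1 \text{ and } E[d_2(Y, \gamma_2(U,V))] \le D_2 \end{array} \right\},$$

for fixed distortions $(D_1, D_2)$, source $p(xy)$, and some functions $\gamma_1 : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}}$ and $\gamma_2 : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{Y}}$. To make a direct comparison with $\mathcal{P}_{BT}$ easier, we rewrite $\mathcal{P}_{LB}$ in terms of two variables $U$ and $V$ as follows:

• Set $\mathcal{U} = \hat{\mathcal{X}}$ and $\mathcal{V} = \hat{\mathcal{Y}}$.
• For any $p_{XY\hat{X}\hat{Y}} \in \mathcal{P}_{LB}$, set $p_{XYUV}(xyuv) = p_{XY\hat{X}\hat{Y}}(xy\hat{x}\hat{y})$.

Then, it is clear that $\mathcal{P}'_{LB}$, defined by

$$\mathcal{P}'_{LB} \triangleq \left\{ p_{XYUV} : \begin{array}{l} p(xy) = \sum_{uv} p_{XYUV}(xyuv), \\ X - UV - Y \text{ is a Markov chain}, \\ E[d_1(X, \gamma_1(U,V))] \le D_1 \text{ and } E[d_2(Y, \gamma_2(U,V))] \le D_2 \end{array} \right\},$$

again for fixed distortions $(D_1, D_2)$, source $p(xy)$, and some functions $\gamma_1 : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}}$ and $\gamma_2 : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{Y}}$,
is just a relabeling of $\mathcal{P}_{LB}$.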
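The three defining constraints of $\mathcal{P}'_{LB}$ are easy to test for a given finite distribution. The sketch below is an illustration only: the function names, the tolerance parameters, and the toy source and codes are assumptions introduced here, and the decoding maps `gamma1`, `gamma2` are supplied by the caller rather than searched over.

```python
from itertools import product


def marginal(p, idx):
    """Marginal pmf of the coordinates in idx; p maps (x, y, u, v) to a probability."""
    m = {}
    for key, prob in p.items():
        sub = tuple(key[i] for i in idx)
        m[sub] = m.get(sub, 0.0) + prob
    return m


def is_markov_x_uv_y(p, tol=1e-12):
    """X - UV - Y holds iff p(xyuv) p(uv) = p(xuv) p(yuv) for every (x, y, u, v)."""
    p_uv = marginal(p, (2, 3))
    p_xuv = marginal(p, (0, 2, 3))
    p_yuv = marginal(p, (1, 2, 3))
    xs = {k[0] for k in p}
    ys = {k[1] for k in p}
    for x, y, (u, v) in product(xs, ys, p_uv):
        lhs = p.get((x, y, u, v), 0.0) * p_uv[(u, v)]
        rhs = p_xuv.get((x, u, v), 0.0) * p_yuv.get((y, u, v), 0.0)
        if abs(lhs - rhs) > tol:
            return False
    return True


def expected_distortion(p, d, gamma, coord):
    """E[d(S, gamma(U, V))], where S is coordinate `coord` (0 for X, 1 for Y)."""
    return sum(prob * d(key[coord], gamma(key[2], key[3])) for key, prob in p.items())


def in_p_lb_prime(p, source, d1, d2, gamma1, gamma2, D1, D2, tol=1e-9):
    """Check the source-marginal, Markov, and distortion constraints."""
    p_xy = marginal(p, (0, 1))
    same_source = all(abs(p_xy.get(k, 0.0) - source.get(k, 0.0)) <= tol
                      for k in set(p_xy) | set(source))
    return (same_source and is_markov_x_uv_y(p)
            and expected_distortion(p, d1, gamma1, 0) <= D1 + tol
            and expected_distortion(p, d2, gamma2, 1) <= D2 + tol)


def hamming(a, b):
    return 0.0 if a == b else 1.0


# Hypothetical source: X = Y, a uniform bit.
source = {(0, 0): 0.5, (1, 1): 0.5}
copy_code = {(0, 0, 0, 0): 0.5, (1, 1, 1, 1): 0.5}    # U = X, V = Y
const_code = {(0, 0, 0, 0): 0.5, (1, 1, 0, 0): 0.5}   # U, V constant

print(in_p_lb_prime(copy_code, source, hamming, hamming,
                    lambda u, v: u, lambda u, v: v, 0.0, 0.0))
print(in_p_lb_prime(const_code, source, hamming, hamming,
                    lambda u, v: u, lambda u, v: v, 1.0, 1.0))
```

The second code fails the test even with loose distortion targets, because constant $U$, $V$ cannot make the perfectly correlated $X$ and $Y$ conditionally independent.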

In terms of these sets, we can state the following bounds on $\mathcal{R}^*(D_1, D_2)$:

$$\bigcup_{p \in \mathcal{P}_{BT}} \mathcal{R}(D_1, D_2, p) \subseteq \mathcal{R}^*(D_1, D_2) \subseteq \bigcup_{p \in \mathcal{P}'_{LB}} \mathcal{R}(D_1, D_2, p). \tag{6}$$

$\mathcal{R}^*(D_1, D_2)$ is not a characterization of the region of achievable rates that we would normally consider satisfactory, in that it is not "computable," in the sense of [14, pg. 259]. Yet with eqn. (6), we have managed to "sandwich" the uncomputable $\mathcal{R}^*(D_1, D_2)$ region in between two other regions, both of which are computable:

• in $\mathcal{P}'_{LB}$, $U$ and $V$ are taken over finite alphabets ($\mathcal{U} = \hat{\mathcal{X}}$ and $\mathcal{V} = \hat{\mathcal{Y}}$);
• and in $\mathcal{P}_{BT}$, although we have not been able to find anywhere in the literature a proof that the cardinality of $\mathcal{U}$ and $\mathcal{V}$ must be finite, presumably a direct application of the method of Ahlswede and Korner should produce the desired bounds [1], [16].

This is of interest because, as far as we can tell, none of the outer bounds we have found in the literature are computable.



B. Relationship to the Berger-Tung Outer Bound 

One simple sufficient condition (which unfortunately does not hold) for proving the inclusions in eqn. (6) to be in fact equalities would have been to show that $\mathcal{P}'_{LB} \subseteq \mathcal{P}_{BT}$. However, a direct comparison among these two sets is still revealing. Consider any distribution $p$ that satisfies the constraints of both sets (i.e., $p \in \mathcal{P}'_{LB} \cap \mathcal{P}_{BT}$), and elements $xyuv$ for which $p(xyuv) \neq 0$. Then, this $p$ admits two different factorizations:

$$\begin{aligned}
& p(uv)\,p(x|uv)\,p(y|uv) = p(xy)\,p(u|x)\,p(v|y) \\
\Leftrightarrow\ & p(uv)\,\frac{p(uv|x)p(x)}{p(uv)}\,\frac{p(uv|y)p(y)}{p(uv)} = p(xy)\,p(u|x)\,p(v|y) \\
\Leftrightarrow\ & p(uv|x)\,p(x)\,p(uv|y)\,p(y) = p(xy)\,p(u|x)\,p(v|y)\,p(uv) \\
\Leftrightarrow\ & p(u|x)\,p(v|x)\,p(x)\,p(u|y)\,p(v|y)\,p(y) = p(xy)\,p(u|x)\,p(v|y)\,p(uv) \\
\Leftrightarrow\ & p(v|x)\,p(x)\,p(u|y)\,p(y) = p(xy)\,p(uv) \\
\Leftrightarrow\ & p(xv)\,p(yu) = p(xy)\,p(uv).
\end{aligned}$$

Clearly, any distribution in this intersection must make all variables pairwise independent: integrate any two 
of them, the other two can be expressed as the product of their marginals. 
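To make the marginalization argument concrete, the following worked steps show which independence relations drop out of the last identity. This is a sketch under one explicit assumption that the text does not state: that $p(xv)p(yu) = p(xy)p(uv)$ holds for every $(x, y, u, v)$ in the product alphabet, not only where $p(xyuv) \neq 0$. Under that assumption the steps yield independence of the pairs $(Y,U)$, $(X,V)$, $(X,Y)$ and $(U,V)$; summing over $(y,v)$ or over $(x,u)$ gives only trivial identities, so the pairs $(X,U)$ and $(Y,V)$ are not settled by this identity alone.

```latex
\begin{align*}
\text{sum over } x, v: \quad
  & \sum_{x,v} p(xv)\,p(yu) = \sum_{x,v} p(xy)\,p(uv)
  && \Longrightarrow \quad p(yu) = p(y)\,p(u), \\
\text{sum over } y, u: \quad
  & \sum_{y,u} p(xv)\,p(yu) = \sum_{y,u} p(xy)\,p(uv)
  && \Longrightarrow \quad p(xv) = p(x)\,p(v), \\
\text{substitute back:} \quad
  & p(x)\,p(v)\,p(y)\,p(u) = p(xy)\,p(uv), \\
\text{sum over } u, v: \quad
  & p(x)\,p(y) = p(xy), \\
\text{sum over } x, y: \quad
  & p(u)\,p(v) = p(uv).
\end{align*}
```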

We find this observation interesting because it provides clear evidence that our lower bound is very 
different in nature from the Berger-Tung outer bound [4], [19]. In that bound, the set of distributions in the 
outer bound (all Markov chains of the form $U - X - Y$ and $X - Y - V$) strictly contains $\mathcal{P}_{BT}$; that means, there is a subset of the distributions in the outer bound that generates all rates we know to be achievable.
In our bound, since $\mathcal{P}'_{LB} \cap \mathcal{P}_{BT}$ is a degenerate set, none of the distributions $p \in \mathcal{P}'_{LB}$ can be used to define
a code construction based on known methods,^ such as the "quantize-then-bin" strategy used in the proof 
of the Berger-Tung inner bound. 

C. Computation of the Outer Bound 

The finite parameterization of our outer bound is an important contribution in itself we believe, given 
the fact that the Berger-Tung outer bound is not computable.^ This is of interest in part because, at least in 
principle, this finite parameterization renders the problem amenable to analysis using computational methods. 
Finding an efficient algorithm for computing solutions to the optimization problem defined by Theorem 1, 
similar in spirit to the Blahut-Arimoto algorithm for the numerical evaluation of channel capacity and rate- 
distortion functions [2], [9], certainly is an interesting challenge in its own right. 
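For reference, the classical algorithm mentioned here can be sketched in a few lines for the single-source rate-distortion function. This is the standard Blahut-Arimoto iteration, not an algorithm for the optimization problem of Theorem 1; the function name, the fixed iteration count, and the choice of a Lagrange slope `beta` as the input parameter are assumptions made for this illustration.

```python
import math


def blahut_arimoto_rd(p_x, dist, beta, iters=200):
    """One point (R, D) on the rate-distortion curve of a discrete source.

    p_x  : list of source probabilities p(x)
    dist : dist[x][xh] is the distortion d(x, xh)
    beta : positive slope parameter trading rate against distortion
    Returns (R in bits, D).
    """
    n_x, n_xh = len(p_x), len(dist[0])
    q = [1.0 / n_xh] * n_xh                      # output marginal q(xh)
    for _ in range(iters):
        # Update the test channel: Q(xh|x) proportional to q(xh) exp(-beta d(x, xh)).
        Q = []
        for x in range(n_x):
            row = [q[xh] * math.exp(-beta * dist[x][xh]) for xh in range(n_xh)]
            total = sum(row)
            Q.append([r / total for r in row])
        # Update the output marginal induced by the test channel.
        q = [sum(p_x[x] * Q[x][xh] for x in range(n_x)) for xh in range(n_xh)]
    D = sum(p_x[x] * Q[x][xh] * dist[x][xh]
            for x in range(n_x) for xh in range(n_xh))
    R = sum(p_x[x] * Q[x][xh] * math.log2(Q[x][xh] / q[xh])
            for x in range(n_x) for xh in range(n_xh)
            if p_x[x] > 0 and Q[x][xh] > 0)
    return R, D


def binary_entropy(t):
    """Binary entropy function in bits."""
    if t in (0.0, 1.0):
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


# Uniform binary source with Hamming distortion, where R(D) = 1 - h(D).
R, D = blahut_arimoto_rd([0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]], beta=2.0)
print(R, D, 1.0 - binary_entropy(D))
```

Sweeping `beta` traces out the curve; the uniform binary source is used only because its rate-distortion function is known in closed form and gives a quick sanity check.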

More fundamentally though, we believe the computability of our bound holds the key to completing a proof
of the optimality of the Berger-Tung inner bound for the problem setup of Fig. 1:

• Computational methods are of interest not only because they lead to answers that are "useful in practice;" discovering efficient algorithms invariably requires the uncovering of structure in the problem. A good example in our field: the characterization by Chiang and Boyd of the Lagrange duals of channel capacity and rate-distortion as convex geometric programs [10].
• Last but not least, an efficient algorithm to compute the sandwich terms in eqn. (6) provides a fallback strategy. If all else fails, at least by means of numerical methods we can check whether, in concrete instances of the problem, the lower and upper bounds coincide or not.

The achievability of the set of rates defined by Theorem 1, and the effective computation of the bounds of eqn. (6), are the main topics considered in Part II.

^Except of course for trivial cases, such as when the two sources X and Y are independent, and the distortion is maximum.
*And neither is the more modern outer bound of Wagner and Anantharam [20], [21], also mentioned in the introduction.

Acknowledgements: In the final version.